Goto

Collaborating Authors

 decentralized learning algorithm


Global Update Tracking: A Decentralized Learning Algorithm for Heterogeneous Data

Neural Information Processing Systems

Decentralized learning enables the training of deep learning models over large distributed datasets generated at different locations, without the need for a central server. However, in practical scenarios, the data distribution across these devices can be significantly different, leading to a degradation in model performance. In this paper, we focus on designing a decentralized learning algorithm that is less susceptible to variations in data distribution across devices. We propose Global Update Tracking (GUT), a novel tracking-based method that aims to mitigate the impact of heterogeneous data in decentralized learning without introducing any communication overhead. We demonstrate the effectiveness of the proposed technique through an exhaustive set of experiments on various Computer Vision datasets (CIFAR-10, CIFAR-100, Fashion MNIST, and ImageNette), model architectures, and network topologies. Our experiments show that the proposed method achieves state-of-the-art performance for decentralized learning on heterogeneous data via a 1-6% improvement in test accuracy compared to other existing techniques.


Global Update Tracking: A Decentralized Learning Algorithm for Heterogeneous Data

Neural Information Processing Systems

Decentralized learning enables the training of deep learning models over large distributed datasets generated at different locations, without the need for a central server. However, in practical scenarios, the data distribution across these devices can be significantly different, leading to a degradation in model performance. In this paper, we focus on designing a decentralized learning algorithm that is less susceptible to variations in data distribution across devices. We propose Global Update Tracking (GUT), a novel tracking-based method that aims to mitigate the impact of heterogeneous data in decentralized learning without introducing any communication overhead. We demonstrate the effectiveness of the proposed technique through an exhaustive set of experiments on various Computer Vision datasets (CIFAR-10, CIFAR-100, Fashion MNIST, and ImageNette), model architectures, and network topologies. Our experiments show that the proposed method achieves state-of-the-art performance for decentralized learning on heterogeneous data via a 1-6% improvement in test accuracy compared to other existing techniques.


Decentralized Learning in General-sum Markov Games

Maheshwari, Chinmay, Wu, Manxi, Sastry, Shankar

arXiv.org Artificial Intelligence

The Markov game framework is widely used to model interactions among agents with heterogeneous utilities in dynamic, uncertain, societal-scale systems. In these settings, agents typically operate in a decentralized manner due to privacy and scalability concerns, often without knowledge of others' strategies. Designing decentralized learning algorithms that provably converge to rational outcomes remains challenging, especially beyond Markov zero-sum and potential games, which do not fully capture the mixed cooperative-competitive nature of real-world interactions. Our paper focuses on designing decentralized learning algorithms for general-sum Markov games, aiming to provide guarantees of convergence to approximate Nash equilibria. We introduce a Markov Near-Potential Function (MNPF), and show that MNPF plays a central role in the analysis of convergence of an actor-critic-based decentralized learning dynamics to approximate Nash equilibria. Our analysis leverages the two-timescale nature of actor-critic algorithms, where Q-function updates occur faster than policy updates. This result is further strengthened under certain regularity conditions and when the set of Nash equilibria is finite. Our findings provide a new perspective on the analysis of decentralized learning in multi-agent systems, addressing the complexities of real-world interactions.


iDML: Incentivized Decentralized Machine Learning

Yu, Haoxiang, Chen, Hsiao-Yuan, Lee, Sangsu, Vishwanath, Sriram, Zheng, Xi, Julien, Christine

arXiv.org Artificial Intelligence

With the rising emergence of decentralized and opportunistic approaches to machine learning, end devices are increasingly tasked with training deep learning models on-devices using crowd-sourced data that they collect themselves. These approaches are desirable from a resource consumption perspective and also from a privacy preservation perspective. When the devices benefit directly from the trained models, the incentives are implicit - contributing devices' resources are incentivized by the availability of the higher-accuracy model that results from collaboration. However, explicit incentive mechanisms must be provided when end-user devices are asked to contribute their resources (e.g., computation, communication, and data) to a task performed primarily for the benefit of others, e.g., training a model for a task that a neighbor device needs but the device owner is uninterested in. In this project, we propose a novel blockchain-based incentive mechanism for completely decentralized and opportunistic learning architectures. We leverage a smart contract not only for providing explicit incentives to end devices to participate in decentralized learning but also to create a fully decentralized mechanism to inspect and reflect on the behavior of the learning architecture.


The Non-IID Data Quagmire of Decentralized Machine Learning

Hsieh, Kevin, Phanishayee, Amar, Mutlu, Onur, Gibbons, Phillip B.

arXiv.org Machine Learning

Many large-scale machine learning (ML) applications need to train ML models over decentralized datasets that are generated at different devices and locations. These decentralized datasets pose a fundamental challenge to ML because they are typically generated in very different contexts, which leads to significant differences in data distribution across devices/locations (i.e., they are not independent and identically distributed (IID)). In this work, we take a step toward better understanding this challenge, by presenting the first detailed experimental study of the impact of such non-IID data on the decentralized training of deep neural networks (DNNs). Our study shows that: (i) the problem of non-IID data partitions is fundamental and pervasive, as it exists in all ML applications, DNN models, training datasets, and decentralized learning algorithms in our study; (ii) this problem is particularly difficult for DNN models with batch normalization layers; and (iii) the degree of deviation from IID (the skewness) is a key determinant of the difficulty level of the problem. With these findings in mind, we present SkewScout, a system-level approach that adapts the communication frequency of decentralized learning algorithms to the (skew-induced) accuracy loss between data partitions. We also show that group normalization can recover much of the skew-induced accuracy loss of batch normalization.